Minimizing Data-Communication Costs by Decomposing Query Results in Client-Server Environments
Authors
Abstract
Many database applications adopt a client-server architecture, in which data resides on a server that receives queries from a client. For each query, the server often needs to transfer to the client a large amount of data as the answer. The communication network in these environments can become a bottleneck in the computation. In this paper we study how to minimize the communication costs of transferring answers to large-join queries from server to client. We propose a novel technique that decomposes the answer into intermediate results, or views, which can reduce the redundancy in the answer. These views are transferred to the client, which uses them to compute the final answer. There are several challenges in implementing this technique: (1) the number of possible plans for decomposing the answer can be very large; (2) the technique requires an efficient algorithm that accurately estimates the size of each view; and (3) many factors can affect the decomposition choice, one of which is whether relevant data is cached on the client. Our extensive experiments on queries adapted from the TPC-H benchmark show that our technique can significantly reduce the communication costs of transferring answers to large-join queries. The extra steps in our approach pay off by reducing the total time needed to transfer a query result when the result contains a lot of redundancy.
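The idea of shipping views instead of the full join answer can be illustrated with a toy sketch. The relations, the `join` and `cost` helpers, and the character-count cost measure below are all hypothetical and chosen only for illustration; they are not the paper's algorithm or cost model. The sketch compares sending the complete join result against sending two smaller views that the client joins locally.

```python
# Toy illustration only: relations, helper names, and the cost measure
# are assumptions made for this sketch, not taken from the paper.

customers = [
    (1, "Alice", "12 Long Street, Springfield"),
    (2, "Bob", "99 Wide Avenue, Shelbyville"),
]
orders = [
    (100, 1, "widget"),
    (101, 1, "gadget"),
    (102, 1, "gizmo"),
    (103, 2, "widget"),
]


def join(customers, orders):
    """Hash join on customer id.

    Returns rows of the form (order_id, item, cust_id, name, address).
    """
    by_id = {c[0]: c for c in customers}
    return [(o[0], o[2], *by_id[o[1]]) for o in orders if o[1] in by_id]


def cost(rows):
    """Crude transfer cost: total characters over all field values."""
    return sum(len(str(value)) for row in rows for value in row)


# Option A: the server computes the join and ships the full answer.
# Each customer's name and address are repeated once per matching order.
full_answer = join(customers, orders)
cost_full = cost(full_answer)

# Option B: the server ships two views instead. The customer view keeps
# only customers that appear in some order, so no useless rows are sent.
wanted_ids = {o[1] for o in orders}
view_customers = [c for c in customers if c[0] in wanted_ids]
view_orders = orders
cost_views = cost(view_customers) + cost(view_orders)

# The client rebuilds the final answer from the views it received.
client_answer = join(view_customers, view_orders)

print("full answer cost:", cost_full)
print("views cost:", cost_views)
print("same answer:", sorted(client_answer) == sorted(full_answer))
```

In this toy case the views are cheaper because customer details are sent once rather than once per order. When the join result has little redundancy, the views may not be smaller, which is why the abstract stresses accurate view-size estimation before choosing a decomposition.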
Similar resources
Separating indexes from data: a distributed scheme for secure database outsourcing
Database outsourcing is an approach that relieves organizations of the burden of database management. Since data is a critical asset of organizations, its privacy must be protected from outside adversaries and untrusted servers. In this paper, we present a distributed scheme based on storing shares of data on different servers and separating indexes from data on a distinct server. Shamir...
Achieving Communication Efficiency through Push-Pull Partitioning of Semantic Spaces in Client-Server Architectures
Client-server databases that require query results to be up-to-date despite storing data that changes dynamically suffer from heavy communication costs. Client-side caching can help mitigate these costs, particularly when individual PUSH-PULL decisions are made for the different semantic regions in the data space. In the PUSH regions the server notifies the client about updates, and in the PULL...
Communication-Efficient Query Answering with Quality Guarantees in Client-Server Applications
We study how to reduce costs in client-server web based applications with dynamic data on the server. Client-side caching can help mitigate costs because the client can use the cached data to answer queries. Allowing some tolerance on the data staleness to answer queries makes it possible to significantly reduce costs. For example, if the user can tolerate data that was received 2 hours ago, we...
A Study of Query Execution Strategies for Client-server Database Systems
Query processing in a client-server database system raises the question of where to execute queries to minimize the communication costs and response time of a query, and to load-balance the system. This paper evaluates the two common query execution strategies, data shipping and query shipping, and a policy referred to as hybrid shipping. Data shipping determines that queries be executed at cli...
Design of a Web Service for Managing Post-Flood Relief Using Volunteered Geographic Information (VGI) Based on Open-Source Technology
Access to precise spatial and real-time data plays a valuable role in the speed and quality of flood relief operations and, in turn, reduces human and financial losses. Collecting and processing real-time flood data, for instance the precise location and situation of flood victims, may be a big challenge in Iran regarding the hardware facilities (such as high resolution aerial ...